Scientists with datasets (Breast Cancer Area) GitHub repo
We use HetGNN to learn node embeddings for the Breast Cancer area from a heterogeneous graph with four kinds of nodes (scientists, datasets, papers, and bioentities) connected by five kinds of relationships: writing, collaboration, mention, citation, and use.
Here are the results of K-means clustering (k=5) on the embeddings, visualized with t-SNE. The cluster labels are derived from the bioentities in each cluster, ranked by TF-IDF.
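To make the labeling step concrete, here is a minimal sketch of how TF-IDF could pick label terms for each cluster. It is an illustration under assumptions, not the code from the repo: the function name `tfidf_cluster_labels`, the input layout (a dict from cluster id to the list of bioentity mentions in that cluster), the smoothed IDF formula, and the toy entity lists are all hypothetical. Each cluster is treated as one "document", so entities that appear in every cluster score zero and distinctive ones rise to the top.

```python
import math
from collections import Counter


def tfidf_cluster_labels(cluster_entities, top_n=2):
    """Return the top_n highest-scoring bioentities per cluster.

    cluster_entities: dict mapping cluster id -> list of bioentity names
    (repeats allowed). This input layout is an assumption of the sketch.
    """
    n_clusters = len(cluster_entities)

    # Document frequency: number of clusters in which each entity appears.
    df = Counter()
    for entities in cluster_entities.values():
        df.update(set(entities))

    labels = {}
    for cid, entities in cluster_entities.items():
        tf = Counter(entities)
        total = sum(tf.values())
        # Smoothed IDF: an entity present in every cluster gets log(1) = 0.
        scores = {
            e: (count / total) * math.log((1 + n_clusters) / (1 + df[e]))
            for e, count in tf.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda e: (-scores[e], e))
        labels[cid] = ranked[:top_n]
    return labels


# Toy data made up for illustration only (not taken from the project).
clusters = {
    0: ["BRCA1", "BRCA1", "TP53", "tamoxifen"],
    1: ["HER2", "HER2", "trastuzumab", "TP53"],
    2: ["ESR1", "tamoxifen", "tamoxifen", "TP53"],
}
print(tfidf_cluster_labels(clusters, top_n=1))
```

In this toy input, `TP53` occurs in all three clusters, so it never becomes a label; entities concentrated in a single cluster are preferred. The cluster ids themselves would come from the K-means assignment (k=5 in the results above).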